RapidMiner Studio documentation, version 10.0

Aggregate Token Length (Text Processing)

Synopsis

Extracts the lengths of all tokens of a document and aggregates them into a single value. The result is added to the document as a new meta data entry.

Description

This operator iterates over all tokens of a given document and aggregates the length of each token with the aggregation function specified by the user. The resulting value will be added as a numerical meta data entry. The name of the entry can be defined by the user.
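The behaviour described above can be illustrated with a minimal Python sketch. It is not RapidMiner code: the function name `aggregate_token_length`, the aggregation names (`average`, `minimum`, `maximum`, `sum`), and the default key `token_length` are assumptions made for illustration, since the operator's actual option values are not listed on this page.

```python
from statistics import mean

# Hypothetical aggregation names; the operator's real options may differ.
AGGREGATIONS = {
    "average": mean,
    "minimum": min,
    "maximum": max,
    "sum": sum,
}


def aggregate_token_length(tokens, aggregation="average", metadata_key="token_length"):
    """Aggregate the lengths of all tokens into one numerical meta data entry."""
    if not tokens:
        raise ValueError("document has no tokens")
    # Step 1: take the length of every token.
    lengths = [len(token) for token in tokens]
    # Step 2: reduce the lengths to a single number with the chosen function.
    value = float(AGGREGATIONS[aggregation](lengths))
    # Step 3: store the result under the user-defined key.
    return {metadata_key: value}


# Assumes the document has already been tokenized.
tokens = ["the", "quick", "brown", "fox"]
print(aggregate_token_length(tokens, aggregation="maximum", metadata_key="max_len"))
```

In this sketch the meta data is a plain dictionary. In the operator, the entry is attached to the document, and its key later becomes an attribute name.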

Input

  • document

    The document port.

Output

  • document

    The document port.

Parameters

  • metadata_key

    The aggregated value of the tokens' lengths is saved under this meta data key. The key becomes the name of the attribute after document processing. Range:

  • aggregation

    Specifies how the tokens' lengths are aggregated into a single number. Range: